Effectiveness of Aggregation Methods in Blog Distillation
Authors
Abstract
This paper addresses the blog distillation problem: given a user query, find the blogs that are most related to the query topic. We model each post as evidence of the relevance of a blog to the query, and use aggregation methods such as Ordered Weighted Averaging (OWA) operators to combine the evidence. We show that using only highly relevant evidence (posts) for each blog can result in an effective retrieval system. We implement our methods on the TREC’06 blog collection with the two standard query sets of TREC’07 and TREC’08. Our experiments on the TREC’07 query set show a 35% improvement in Mean Average Precision and a 22% improvement in Precision@10 over the best fusion method applied to blog distillation. Similar results are obtained on the TREC’08 query set, with a 31% improvement in Mean Average Precision and a 20% improvement in Precision@10 over the baseline.
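As a rough illustration of the aggregation idea described in the abstract, the sketch below scores a blog by applying an OWA operator to the relevance scores of its posts. The weighting scheme (uniform weight on the top k posts, zero elsewhere), the function names, and the example scores are assumptions made for illustration; the abstract does not state which OWA weights the authors use.

```python
from typing import Dict, List, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights attach to rank positions, not to posts."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Sort scores from most to least relevant, then take the weighted sum.
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def top_k_weights(n: int, k: int) -> List[float]:
    """Hypothetical weighting: uniform weight on the k best posts, zero on the rest."""
    k = max(1, min(k, n))
    return [1.0 / k] * k + [0.0] * (n - k)


def rank_blogs(post_scores: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Rank blogs by the OWA aggregate of their post relevance scores."""
    blog_scores = {
        blog: owa(scores, top_k_weights(len(scores), k))
        for blog, scores in post_scores.items()
        if scores  # skip blogs with no scored posts
    }
    return sorted(blog_scores, key=blog_scores.get, reverse=True)


if __name__ == "__main__":
    # Made-up post relevance scores for two blogs.
    example = {"blog_a": [0.9, 0.1, 0.1], "blog_b": [0.6, 0.6]}
    print(rank_blogs(example, k=2))
```

With weights concentrated on the top-ranked positions, only the most relevant posts of each blog contribute to its score, which is one way to read "using only highly relevant evidence". Choosing k = 1 reduces the operator to the maximum, while spreading the weights uniformly over all posts reduces it to the plain average.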
Similar resources
FEUP at TREC 2008 Blog Track: Using Temporal Evidence for Ranking and Feed Distillation
This paper presents the participation of FEUP, from the University of Porto, in the TREC 2008 Blog Track. FEUP participated in two tasks, the baseline adhoc retrieval task and the blog finding distillation task. Our approach focused on the use of the temporal information available in the TREC Blog06 collection. For the baseline adhoc retrieval task, a simple temporal sort was evaluated. In the b...
Faceted Blog Distillation System: Find an in-Depth Blog
With the increasing number of blog users, traditional blog search can no longer meet their demands. More work is needed to help users find good blogs to read, beyond topic-relevant blogs. This paper focuses on the problem of in-depth faceted blog distillation, addressing the quality aspect of the retrieved blogs. We propose a novel L-Qtf coefficient and LQE model to ...
Linguistic aggregation methods in blog retrieval
This paper addresses the blog distillation problem: given a user query, find the blogs that are most related to the query topic. We model each post as evidence of the relevance of a blog to the query, and use aggregation methods such as Ordered Weighted Averaging (OWA) operators to combine the evidence. We show that using only highly relevant evidence (posts) for each blog can result in an...
Cross-Lingual Blog Analysis based on Multilingual Blog Distillation from Multilingual Wikipedia Entries
The goal of this paper is to cross-lingually analyze multilingual blogs collected with a topic keyword. The framework for collecting multilingual blogs with a topic keyword is designed as a blog distillation (feed search) procedure. Multilingual queries for retrieving blog feeds are created from Wikipedia entries. Finally, we cross-lingually and cross-culturally compare less well known facts and...
HIT_LTRC at TREC 2010 Blog Track: Faceted Blog Distillation
This paper describes our participation in the faceted blog distillation task at Blog Track 2010. In our approach, the Indri toolkit is applied for basic topic relevance retrieval. Then the Maximum Entropy (ME) model is adopted to judge the relevance of each blog to the specified facet. Feed faceted relevance is calculated by integrating the average relevance of all blogs within a feed and the average r...